Audio: STFT Process: Add Xtensa HiFi function versions#10638

Open
singalsu wants to merge 2 commits into thesofproject:main from singalsu:stft_add_hifi_optimize

@singalsu
Collaborator

This patch adds HiFi3 versions of the two higher-complexity functions, stft_process_apply_window() and stft_process_overlap_add_ifft_buffer(), to stft_process-hifi3.c.

Functions with no clear benefit from HiFi optimization are moved from stft_process-generic.c to stft_process_common.c. Those functions only move data and do practically no processing on the samples.
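
As a point of reference for what a windowing step computes, here is a minimal scalar sketch in plain C. It assumes Q1.31 fixed-point samples and window coefficients. The names q31_mul() and apply_window_q31() are hypothetical, and this is not the SOF implementation, which operates on the component's own buffer structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: Q1.31 x Q1.31 -> Q1.31 with rounding and saturation.
 * Assumes the compiler uses an arithmetic right shift for negative values,
 * which is the case for GCC, Clang and the Xtensa toolchains.
 */
static int32_t q31_mul(int32_t a, int32_t b)
{
	int64_t p = (int64_t)a * b;	/* Q2.62 product */

	p = (p + (1LL << 30)) >> 31;	/* round and scale back to Q1.31 */
	if (p > INT32_MAX)
		p = INT32_MAX;
	if (p < INT32_MIN)
		p = INT32_MIN;
	return (int32_t)p;
}

/* Hypothetical scalar windowing: multiply each sample by its window coefficient. */
static void apply_window_q31(int32_t *buf, const int32_t *win, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		buf[i] = q31_mul(buf[i], win[i]);
}
```

A loop of this shape is a natural SIMD candidate because every iteration is independent, which is consistent with windowing being one of the two functions chosen for HiFi3 versions.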

This change saves 17 MCPS (from 63 MCPS to 46 MCPS, about 27%). The test was run with the following scripts:

scripts/rebuild-testbench.sh -p mtl
scripts/sof-testbench-helper.sh -x -m stft_process_1024_256_ \
  -p profile-stft_process.txt

The STFT above used an FFT length of 1024 with a hop of 256.
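
With an FFT length of 1024 and a hop of 256, consecutive frames overlap by 768 samples, so each output sample is the sum of up to four IFFT frames. The following is a minimal scalar sketch of generic overlap-add, assuming 32-bit samples and hop <= fft_size. The names sat_add32(), overlap_add() and emit_hop() are hypothetical, and the buffer handling is simplified compared with what stft_process_overlap_add_ifft_buffer() has to do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 32-bit add that saturates instead of wrapping. */
static int32_t sat_add32(int32_t a, int32_t b)
{
	int64_t s = (int64_t)a + b;

	if (s > INT32_MAX)
		return INT32_MAX;
	if (s < INT32_MIN)
		return INT32_MIN;
	return (int32_t)s;
}

/* Add one IFFT output frame on top of the accumulator. */
static void overlap_add(int32_t *acc, const int32_t *frame, size_t fft_size)
{
	size_t i;

	for (i = 0; i < fft_size; i++)
		acc[i] = sat_add32(acc[i], frame[i]);
}

/* Copy out the hop samples that are now complete, then slide the
 * accumulator by hop and zero the freed tail. Assumes hop <= fft_size.
 */
static void emit_hop(int32_t *acc, int32_t *out, size_t fft_size, size_t hop)
{
	memcpy(out, acc, hop * sizeof(*acc));
	memmove(acc, acc + hop, (fft_size - hop) * sizeof(*acc));
	memset(acc + fft_size - hop, 0, hop * sizeof(*acc));
}
```

Calling overlap_add() followed by emit_hop() once per frame yields hop output samples per frame.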

@singalsu singalsu marked this pull request as ready for review March 20, 2026 16:03
Copilot AI review requested due to automatic review settings March 20, 2026 16:03
Contributor

Copilot AI left a comment

Pull request overview

Adds HiFi3 SIMD implementations for STFT hot-path helpers and refactors shared, non-SIMD-specific routines into a common compilation unit to reduce MCPS.

Changes:

  • Add HiFi3 intrinsic implementations of stft_process_apply_window() and stft_process_overlap_add_ifft_buffer().
  • Move source/sink and buffer-fill helper functions from stft_process-generic.c into stft_process_common.c.
  • Introduce Kconfig SIMD level selection and update build sources to include the HiFi3 unit.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/audio/stft_process/stft_process_common.c | Adds shared source/sink and FFT buffer fill helpers (moved from generic). |
| src/audio/stft_process/stft_process-hifi3.c | New HiFi3 intrinsic implementations for windowing + overlap-add. |
| src/audio/stft_process/stft_process-generic.c | Removes moved helpers; wraps generic implementations behind SOF_USE_HIFI(NONE, ...). |
| src/audio/stft_process/Kconfig.simd | Adds Kconfig choice for SIMD optimization level selection. |
| src/audio/stft_process/Kconfig | Includes the new SIMD Kconfig via rsource. |
| src/audio/stft_process/CMakeLists.txt | Adds the HiFi3 compilation unit to the build. |

Comments suppressed due to low confidence (1)

src/audio/stft_process/stft_process-hifi3.c:1

  • The function relies on 64-bit alignment and even-sample constraints but does not enforce either at runtime. Misalignment can cause load/store exceptions or significant penalties depending on the core/config, and “even samples” is already required to avoid the >> 1 infinite-loop hazard. Add an explicit alignment/size assertion (or a guarded scalar fallback when (uintptr_t)obuf->w_ptr is not 8-byte aligned or when the contiguous region before wrap is odd-length) to make failures deterministic and easier to diagnose.
// SPDX-License-Identifier: BSD-3-Clause
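
The kind of guard suggested in the suppressed comment could look roughly like the sketch below. It is plain C with a hypothetical helper name, simd64_ok(), and is not code from the pull request. It only checks the two stated constraints: 8-byte pointer alignment and an even sample count.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard: true when ptr is 8-byte aligned and the number of
 * 32-bit samples is even, so the data can be processed as 64-bit pairs.
 */
static bool simd64_ok(const void *ptr, size_t samples)
{
	return ((uintptr_t)ptr & 7u) == 0 && (samples & 1u) == 0;
}
```

A caller could assert on this in debug builds or fall back to a scalar loop when it returns false. Which of the two is appropriate depends on how the component sets up its buffers.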

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.


Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.


Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.


Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.



This patch adds HiFi3 versions of the higher-complexity functions
stft_process_apply_window() and
stft_process_overlap_add_ifft_buffer() to stft_process-hifi3.c.

Functions with no clear HiFi optimization benefit are moved
from stft_process-generic.c to stft_process_common.c. Those
functions move data with practically no processing of the samples.

The stft_process_setup() function is changed to allocate buffers
with mod_balloc_align() to ensure that a 32-bit sample pair or complex
number is aligned for 64-bit Xtensa SIMD. This patch also adds
checks for other parameters to ensure the STFT is set up in a
way that can be executed.
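
To illustrate the alignment requirement, here is a minimal sketch that uses the standard C11 aligned_alloc() as a stand-in. It does not show mod_balloc_align(), whose signature is not given here; the helper name alloc_samples_aligned() and the SIMD_ALIGN constant are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SIMD_ALIGN 8u	/* hypothetical: bytes in one 64-bit SIMD load/store */

/* Hypothetical helper: allocate room for 'samples' 32-bit samples so that
 * the buffer starts on an 8-byte boundary. The size is rounded up to a
 * multiple of the alignment as C11 aligned_alloc() expects. Overflow of
 * the size computation is not handled in this sketch.
 */
static int32_t *alloc_samples_aligned(size_t samples)
{
	size_t bytes = samples * sizeof(int32_t);

	bytes = (bytes + SIMD_ALIGN - 1) & ~(size_t)(SIMD_ALIGN - 1);
	if (bytes == 0)
		return NULL;
	return aligned_alloc(SIMD_ALIGN, bytes);
}
```

With the buffer start aligned and an even number of samples per access, every sample pair or complex value in the buffer also sits on an 8-byte boundary.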

The patch also fixes an oversized allocation in setup. The window
function buffer is shared by all channels, so its allocation should
not be multiplied by the channel count.
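
The sizing point can be sketched as follows. The layout is hypothetical (one fft_size-sample work buffer per channel plus one shared window table of the same length, 32-bit samples) and is only meant to show which term scales with the channel count; the function name stft_alloc_bytes() is not from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizing: per-channel buffers scale with 'channels',
 * the window table is allocated once and shared.
 */
static size_t stft_alloc_bytes(size_t fft_size, size_t channels)
{
	size_t per_channel = fft_size * sizeof(int32_t);
	size_t window = fft_size * sizeof(int32_t);

	return channels * per_channel + window;
}
```

Under these assumptions a stereo setup with FFT length 1024 needs 12288 bytes, whereas multiplying the window by the channel count as well would request 16384 bytes.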

This change saves 17 MCPS (from 63 MCPS to 46 MCPS). The test
was run with these scripts:

scripts/rebuild-testbench.sh -p mtl
scripts/sof-testbench-helper.sh -x -m stft_process_1024_256_ \
  -p profile-stft_process.txt

The STFT above used an FFT length of 1024 with a hop of 256.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>

This patch removes the fill_start_idx member from struct stft_process_fft.
Keeping it would have required another check of data alignment and sample
count in the Xtensa HiFi SIMD code version. This component has no need for
different FFT padding types (left, center, right as in MFCC), so the
member is safe to remove.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
@singalsu singalsu force-pushed the stft_add_hifi_optimize branch from 68005ba to f51925c Compare March 25, 2026 17:42
@singalsu singalsu requested a review from Copilot March 25, 2026 17:44
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

